
Bugfix #23 and improvements as suggested in #48 #50

Merged
merged 47 commits into from
Jan 9, 2025

Conversation

@arizon-dread (Collaborator) commented Jun 19, 2024

Bugfix and suggested improvements implemented

Fixes #23 by checking for the error from clamd.ScanStream, which closes the connection if the file size limit is exceeded. The error was previously ignored. This PR assumes that a closed connection is caused by an exceeded file size, since unfortunately there is no way to detect the ^INSTREAM: file size limit exceeded error from the clamd process in the API. A custom clamd.ScanResult is created inside the error-handling if statement, so the response goes through the same switch/case logic as other responses. The scanHandlerBody func handles this the same way, so the client actually gets a response.

Fixes #48 by creating a new endpoint (/v2/scan) and a scanResult struct that contains status, description and httpStatus (httpStatus is excluded via the json annotation and only used in the code logic). An array of all scanned files (as []scanResult) is then marshalled to json, producing a proper json response for one to many files and returning an array of json objects to the client. The old /scan endpoint uses the same response (formatted the same way as before), but not wrapped in an array: one to many json objects without a proper json array structure, to keep previous behavior intact. The old endpoint does, however, return deprecation and link headers indicating that a new endpoint is available.

Previously, the first file's status determined the http status of the entire response. This PR always returns a 406 http status if any file contains a virus, and only returns 200 OK if all files are clean, for both the old and the new endpoint. I figured the old behavior was a bug.
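The aggregation rule is simple enough to sketch; `overallStatus` is a hypothetical name for illustration:

```go
package main

import (
	"fmt"
	"net/http"
)

// overallStatus (hypothetical name) implements the rule described above:
// 406 if any scanned file is infected, 200 only when every file is clean.
func overallStatus(fileStatuses []int) int {
	for _, s := range fileStatuses {
		if s == http.StatusNotAcceptable {
			return http.StatusNotAcceptable // one infected file taints the whole response
		}
	}
	return http.StatusOK
}

func main() {
	fmt.Println(overallStatus([]int{200, 406, 200})) // 406
	fmt.Println(overallStatus([]int{200, 200}))      // 200
}
```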

I also added a prometheus metrics counter that increments for each virus found.

I also added go.mod and go.sum to use Go modules instead of the vendor directory.

@arizon-dread arizon-dread marked this pull request as draft June 19, 2024 09:00
@arizon-dread arizon-dread marked this pull request as ready for review June 19, 2024 09:08
@arizon-dread commented Oct 17, 2024

I got this working on my local machine yesterday after merging the original repo's master branch into my fork's scan-v2, and successfully tested all the endpoints. Then I built the docker image, but when starting it up, it can't parse clamd.conf or freshclam.conf. I looked into it and nothing looked off as far as I could see, although the "Foreground" setting seems to take "yes" as the value rather than "true"; changing that in the sed command in the Dockerfile did nothing to solve the issue. I will look into it further tonight.
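For illustration, the kind of sed rewrite being discussed looks roughly like this. The path and the exact expressions in the real Dockerfile differ; this only sketches the idea of switching the Foreground value:

```shell
# Hypothetical sample line standing in for the shipped clamd.conf,
# where the option is commented out with a "true"/"false" value.
printf '#Foreground false\n' > /tmp/clamd.conf.sample

# Uncomment the option and set it to "yes", which recent ClamAV
# versions expect instead of "true".
sed -i 's/^#\?Foreground .*/Foreground yes/' /tmp/clamd.conf.sample

grep '^Foreground yes' /tmp/clamd.conf.sample
```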

@arizon-dread commented
@davosian Do you have any insights into this issue parsing the config files? It seems like the version that gets installed in alpine 3.20 is 1.22.r0, which is a little weird considering it's not an LTS release but a release candidate.

@davosian (Contributor) commented

Hi @arizon-dread,

I finally got around to going through the latest PR, so it is about time to get your changes merged into the project. Since you were kind enough to review PR #59, I was wondering whether your tests included the changes from this PR as well?

Great to hear about your openness to supporting the project! As you can tell, my reactions are on the slow end (which I am hoping to improve), so any support I can get to keep this project active is very welcome.

I also just checked the version included in Alpine Edge and even there it is a release candidate (1.3.2-r0). Not sure though whether this is normal for Alpine's Edge versions.

@arizon-dread commented Nov 13, 2024

Hi @davosian

My tests in #59 did not include my code in this PR; I tested the PR's code as it was intended to be merged (checked out the branch from @christianbumann's fork). I have now merged the master branch into my scan-v2 after the merge of #59 and built it. Building and starting the container works, but it seems I need to wait until tomorrow before new signatures are published so I can verify the version update, like @christianbumann had to do when he was working on #59, so I'll get back with results tomorrow.

Yes, the Alpine repos seem to be a bit off with the clamav versions: the official clamav download page promotes 1.4.1 LTS and 1.0.7 LTS, and 1.4.1 is also the version on the official docker hub repo. That's why I was contemplating whether we should switch the base image to the official clamav image to get an LTS release instead. But if we decide to do that, I think it needs to be its own PR from a clean fork of master, not mixed with other stuff like the changes in this PR. In that case, we need to look through the config even more thoroughly to make sure they haven't made breaking changes or changed syntax, which would break our sed approach when building this image.

I'll get back tomorrow with version update results!

@arizon-dread commented

It seems to be working as expected. I also fixed centos.Dockerfile, which was very outdated and wouldn't even build. Centos stream8 is EOL and its repos are offline, so I upgraded to stream9 and modified the Dockerfile to create the folder structure used in the standard Dockerfile (to match the entrypoint.sh script). The centos base image has clamav 1.0.7 in its repos, so the version is slightly older.

@arizon-dread arizon-dread mentioned this pull request Jan 8, 2025
@davosian left a comment

I found that the prometheus metric no_of_found_viruses only considers the new /v2/scan endpoint. Is this by design?

@davosian left a comment

The default for MAX_SCAN_SIZE is 100M; the default for MAX_FILE_SIZE is 25M. To me it would logically make more sense to set the default for MAX_FILE_SIZE equal to or larger than MAX_SCAN_SIZE; otherwise, MAX_SCAN_SIZE would never be hit. Or am I misunderstanding something?

@arizon-dread commented

@davosian

> I found that the prometheus metric no_of_found_viruses only considers the new /v2/scan endpoint. Is this by design?

I had just not thought it through completely; of course, every virus hit should be recorded as an uptick, regardless of which endpoint finds it. I initially had an idea to build a middleware that intercepts all responses on the way back to the client and increments the prometheus counter if the status code is 406, but after looking into this I've realized it's not as straightforward as I initially thought: creating a middleware in Go is fairly easy, but the status code is not an accessible field on the http.ResponseWriter when the response returns to the client. Since we already have another PR, #51, suggesting that we make the responses uniform and use the same func for all the endpoints to determine the response status, my suggestion is that we centralize the prometheus virus ticker logic there instead. I have already started in that direction with my getResponseStatus func; the logical approach is to increment the counter whenever a 406 is registered for the response. Let me see what I can do on this.

> The default for MAX_SCAN_SIZE is 100M; the default for MAX_FILE_SIZE is 25M. To me it would logically make more sense to set the default for MAX_FILE_SIZE equal to or larger than MAX_SCAN_SIZE; otherwise, MAX_SCAN_SIZE would never be hit. Or am I misunderstanding something?

These seem to be the default values in the clamd.conf file out of the box; it looks like that on my machine too. I'm not sure why, though.

@arizon-dread commented Jan 8, 2025

I have done some late-night refactoring to be more uniform in how the statuses are traversed, how the correct http status is set, and how the virus-found ticker is incremented. I will not be changing the response status of scanPathHandler; it has always returned 200. I only added a check for each virus found there, which also increments the counter.

The /scanHandlerBody, /scan and /v2/scan endpoints now use the same funcs for translating clamd.RES_${status} to an http status code, for evaluating statuses, and for writing the status code header. This is also where the prometheus incrementation happens for those three endpoints. Hopefully a bit more uniform than before. I have tried to keep backwards compatibility intact so as not to break anything.
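The shared translation step can be sketched like this. The status strings are the ones clamd reports ("OK", "FOUND"); the function name and the fallback mapping are assumptions for illustration, not the PR's actual code:

```go
package main

import (
	"fmt"
	"net/http"
)

// clamStatusToHTTP (hypothetical name) maps a clamd result status to the
// http status code the service returns for that file.
func clamStatusToHTTP(status string) int {
	switch status {
	case "OK": // clamd.RES_OK
		return http.StatusOK
	case "FOUND": // clamd.RES_FOUND: virus detected
		return http.StatusNotAcceptable
	default: // errors, parse errors, etc.
		return http.StatusInternalServerError
	}
}

func main() {
	fmt.Println(clamStatusToHTTP("OK"), clamStatusToHTTP("FOUND"))
}
```

With all three endpoints funnelled through one such func, the prometheus increment has a single natural home wherever a 406 is produced.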

@davosian commented Jan 9, 2025

> The default for MAX_SCAN_SIZE is 100M; the default for MAX_FILE_SIZE is 25M. To me it would logically make more sense to set the default for MAX_FILE_SIZE equal to or larger than MAX_SCAN_SIZE; otherwise, MAX_SCAN_SIZE would never be hit. Or am I misunderstanding something?

> These seem to be the default values in the clamd.conf file out of the box; it looks like that on my machine too. I'm not sure why, though.

I checked again the defaults and usage in clamav itself:

```
# This option sets the maximum amount of data to be scanned for each input
# file. Archives and other containers are recursively extracted and scanned
# up to this value.
# Value of 0 disables the limit
# Note: disabling this limit or setting it too high may result in severe damage
# to the system.
# Default: 400M
#MaxScanSize 1000M

# Files larger than this limit won't be scanned. Affects the input file itself
# as well as files contained inside it (when the input file is an archive, a
# document or some other kind of container).
# Value of 0 disables the limit.
# Note: disabling this limit or setting it too high may result in severe damage
# to the system.
# Technical design limitations prevent ClamAV from scanning files greater than
# 2 GB at this time.
# Default: 100M
#MaxFileSize 400M
```

I am still not sure why MaxScanSize defaults to a higher value than MaxFileSize, but my take is that MaxScanSize targets the total extracted data of compressed formats like archives, while MaxFileSize limits the size of each file contained in them.

In other words: as you already pointed out, @arizon-dread, we are applying the same logic as the clamav project itself, and therefore we should stick to it.

@davosian left a comment

I did some more testing around the old and new endpoints. The service responded as expected and the counter behaved accordingly: resetting with a fresh start of the container, honoring the MAX_FILE_SIZE, and counting up for each virus found, independently of the endpoint used. Therefore, I would say we are ready to merge this PR. 💪

@davosian davosian merged commit 82744fe into ajilach:master Jan 9, 2025
1 check passed